{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Combining Factors from Multiple Sources"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The Quantopian research environment is tailored to the rapid testing of predictive alpha factors. The process is very similar because it builds on zipline, but offers much richer access to data sources. The following code sample illustrates how to compute alpha factors not only from market data as previously but also from fundamental and alternative data."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Quantopian provides several hundred MorningStar fundamental variables for free and also includes stocktwits signals as an example of an alternative data source. There are also custom universe definitions such as QTradableStocksUS that applies several filters to limit the backtest universe to stocks that were likely tradeable under realistic market conditions."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "ExecuteTime": {
     "end_time": "2018-07-23T04:36:01.647259Z",
     "start_time": "2018-07-23T04:36:01.555385Z"
    }
   },
   "outputs": [],
   "source": [
    "from quantopian.research import run_pipeline\n",
    "from quantopian.pipeline import Pipeline\n",
    "from quantopian.pipeline.data.builtin import USEquityPricing\n",
    "from quantopian.pipeline.data.morningstar import income_statement, operation_ratios, balance_sheet\n",
    "from quantopian.pipeline.data.psychsignal import stocktwits\n",
    "from quantopian.pipeline.factors import CustomFactor, SimpleMovingAverage, Returns\n",
    "from quantopian.pipeline.filters import QTradableStocksUS\n",
    "\n",
    "import numpy as np\n",
    "import pandas as pd\n",
    "from time import time"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "# periods in trading days\n",
    "MONTH = 21\n",
    "QTR = 4 * MONTH\n",
    "YEAR = 12 * MONTH"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We will use a custom `AggregateFundamentals` class to use the last reported fundamental data point. This aims to address the fact that fundamentals are reported quarterly, and Quantopian does not currently provide an easy way to aggregate historical data, say to obtain the sum of the last four quarters, on a rolling basis."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We will again use the custom `MeanReversion` factor from [01_single_factor_zipline](01_single_factor_zipline.ipynb). We will also compute several other factors for the given universe definition using the `rank()` method's mask parameter."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "This algorithm uses a naive method to combine the six individual factors by simply adding the ranks of assets for each of these factors. Instead of equal weights, we would like to take into account the relative importance and incremental information in predicting future returns. The ML algorithms of the next chapters will allow us to do exactly this, using the same backtesting framework."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Execution also relies on run_algorithm(), but the return DataFrame on the Quantopian platform only contains the factor values created by the Pipeline. This is convenient because this data format can be used as input for alphalens, the library for the evaluation of the predictive performance of alpha factors."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "class AggregateFundamentals(CustomFactor):  \n",
    "    def compute(self, today, assets, out, inputs):  \n",
    "        out[:] = inputs[0]\n",
    "\n",
    "        \n",
    "class MeanReversion(CustomFactor):\n",
    "    inputs = [Returns(window_length=MONTH)]\n",
    "    window_length = YEAR\n",
    "\n",
    "    def compute(self, today, assets, out, monthly_returns):\n",
    "        df = pd.DataFrame(monthly_returns)\n",
    "        out[:] = df.iloc[-1].sub(df.mean()).div(df.std())\n",
    "        \n",
    "        \n",
    "def compute_factors():\n",
    "\n",
    "    universe = QTradableStocksUS()        \n",
    "    \n",
    "    profitability = (AggregateFundamentals(inputs = [income_statement.gross_profit], \n",
    "                                           window_length = YEAR) / \n",
    "                     balance_sheet.total_assets.latest).rank(mask=universe)\n",
    "\n",
    "    roic = operation_ratios.roic.latest.rank(mask=universe)\n",
    "        \n",
    "    ebitda_yield = (AggregateFundamentals(inputs = [income_statement.ebitda], \n",
    "                                          window_length = YEAR) /\n",
    "                    USEquityPricing.close.latest).rank(mask=universe)\n",
    "\n",
    "    mean_reversion = MeanReversion().rank(mask=universe)\n",
    "    \n",
    "    price_momentum = Returns(window_length=QTR).rank(mask=universe)\n",
    "    \n",
    "    sentiment = SimpleMovingAverage(inputs=[stocktwits.bull_minus_bear],\n",
    "                                    window_length=5).rank(mask=universe)   \n",
    "\n",
    "    factor = profitability + roic + ebitda_yield + mean_reversion + price_momentum + sentiment    \n",
    "    \n",
    "    return Pipeline(\n",
    "        columns={'Profitability': profitability,  \n",
    "               'ROIC': roic,\n",
    "               'EBITDA Yield': ebitda_yield,\n",
    "               \"Mean Reversion (1M)\": mean_reversion,\n",
    "               'Sentiment': sentiment,\n",
    "               \"Price Momentum (3M)\": price_momentum,\n",
    "               'Alpha Factor': factor})    "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "start_timer = time()\n",
    "start = pd.Timestamp(\"2015-01-01\")\n",
    "end = pd.Timestamp(\"2018-01-01\")\n",
    "results = run_pipeline(compute_factors(), start_date=start, end_date=end)\n",
    "results.index.names = ['date', 'security']\n",
    "print(\"Pipeline run time {:.2f} secs\".format(time() - start_timer))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "results.head()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 2",
   "language": "python",
   "name": "python2"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 2
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython2",
   "version": "2.7.15"
  },
  "toc": {
   "base_numbering": 1,
   "nav_menu": {},
   "number_sections": true,
   "sideBar": true,
   "skip_h1_title": false,
   "title_cell": "Table of Contents",
   "title_sidebar": "Contents",
   "toc_cell": false,
   "toc_position": {},
   "toc_section_display": true,
   "toc_window_display": false
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}
